27 research outputs found
LENTA: Longitudinal Exploration for Network Traffic Analysis from Passive Data
In this work, we present LENTA (Longitudinal Exploration for Network Traffic Analysis), a system that supports network analysts in identifying the traffic generated by services and applications running on the web. For URLs observed in an operational network, LENTA simplifies the analyst's job by presenting a few hundred clusters instead of the original hundreds of thousands of individual URLs. We implement a self-learning methodology in which the system grows its knowledge, which is used in turn to automatically associate traffic with previously observed services and to identify new traffic generated by possibly suspicious applications. This approach lets analysts easily observe changes in network traffic, identify new services, and spot unexpected activities. We follow a data-driven approach and run LENTA on traces collected both in ISP networks and directly on hosts via proxies, analyzing traffic in batches of 24 hours' worth of traffic. Big data solutions are used to enable horizontal scalability and meet performance requirements. We show that LENTA allows analysts to clearly understand which services are running on their network, possibly highlighting malicious traffic and changes over time, greatly simplifying the view and understanding of the network traffic.
LENTA: Longitudinal Exploration for Network Traffic Analysis
In this work, we present LENTA (Longitudinal Exploration for Network Traffic Analysis), a system that supports network analysts in easily identifying traffic generated by services and applications running on the web, whether benign or possibly malicious. First, LENTA simplifies the analysts' job by presenting a few hundred clusters instead of the original hundreds of thousands of individual URLs. Second, it implements a self-learning methodology in which a semi-supervised approach lets the system grow its knowledge, which is used in turn to automatically associate traffic with previously observed services and identify new traffic generated by possibly suspicious applications. This lets analysts easily observe changes in the traffic, such as the birth of new services or unexpected activities. We follow a data-driven approach, running LENTA on real data and analyzing traffic in batches of 24 hours' worth of traffic. We show that LENTA allows analysts to easily understand which services are running on their network, highlights malicious traffic and changes over time, and greatly simplifies the view and understanding of the traffic.
A method for exploring passive traffic traces and grouping similar URLs
A computer security method for the analysis of passive traces of HTTP and HTTPS traffic on the Internet, with extraction and grouping of similar Web transactions automatically generated by malware, malicious services, unsolicited advertising, or other sources, comprising at least the following processing and control steps: a) extraction of URLs from an operational network, using passive exploration of the HTTP and HTTPS traffic data, and subsequent collection of the extracted URLs into batches; b) detection of similar URLs through the calculation of metrics based on the distance among URLs, namely on a measure of the degree of diversity among pairs of the character strings composing the URLs; c) activation of one or more clustering algorithms that group the URLs based on the similarity metrics and obtain, within each group of URLs, elements with similar and homogeneous features, suitable to be analyzed as a single entity; d) visualization of the elements sorted by the degree of cohesion of the URLs contained in each group.
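As a rough illustration of steps b) through d), the sketch below groups URLs by a string-diversity metric and ranks the groups by cohesion. It is a hypothetical simplification, not the patented method: `difflib`'s similarity ratio stands in for the distance metric, a greedy single pass stands in for the clustering algorithms, and the 0.3 threshold is arbitrary.

```python
from difflib import SequenceMatcher

def url_distance(a, b):
    # Degree of diversity between two URL strings: 1 minus similarity ratio.
    return 1.0 - SequenceMatcher(None, a, b).ratio()

def group_similar_urls(urls, threshold=0.3):
    # Greedy single-pass grouping: attach each URL to the first cluster
    # whose representative is within `threshold`, else open a new cluster.
    clusters = []
    for u in urls:
        for c in clusters:
            if url_distance(c[0], u) <= threshold:
                c.append(u)
                break
        else:
            clusters.append([u])

    # Sort clusters by cohesion: mean intra-cluster distance (lower = tighter).
    def cohesion(c):
        if len(c) < 2:
            return 0.0
        pairs = [(x, y) for i, x in enumerate(c) for y in c[i + 1:]]
        return sum(url_distance(x, y) for x, y in pairs) / len(pairs)

    clusters.sort(key=cohesion)
    return clusters
```

Run on three URLs, the two machine-generated tracking URLs end up in one cluster while the unrelated URL stays alone, which is the effect the method exploits to collapse large URL batches into a few groups.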
Collaboration vs. choreography conformance in BPMN
The BPMN 2.0 standard is a widely used semi-formal notation for modelling
distributed information systems from different perspectives. The standard
provides a set of diagrams to represent such perspectives. Choreography
diagrams represent global constraints concerning the interactions among system
components without exposing their internal structure. Collaboration diagrams,
instead, depict the internal behaviour of a component, also referred to as a
process, when integrated with others, so as to represent a possible
implementation of the distributed system.
This paper proposes a design methodology and a formal framework for checking
conformance of choreographies against collaborations. In particular, the paper
presents a direct formal operational semantics for both BPMN choreography and
collaboration diagrams. Conformance is captured through two relations defined
on top of this semantics. The approach benefits from the availability of a
tool we have developed, named C4, that permits experimenting with the
theoretical framework in practical contexts. The objective here is to make the
exploited formal methods transparent to system designers, thus fostering wider
adoption by practitioners.
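The conformance idea can be illustrated, in a much-simplified form, as bounded trace inclusion between two labeled transition systems: every interaction sequence the collaboration can exhibit must be allowed by the choreography. This is a generic sketch, not the paper's two conformance relations, and the state and label names are invented.

```python
def traces(lts, start, depth):
    # Enumerate all label sequences of length <= depth reachable from `start`.
    # `lts` maps a state to a list of (label, next_state) transitions.
    out = {()}
    frontier = {((), start)}
    for _ in range(depth):
        nxt = set()
        for tr, s in frontier:
            for label, t in lts.get(s, []):
                nt = tr + (label,)
                out.add(nt)
                nxt.add((nt, t))
        frontier = nxt
    return out

def conforms(collab, collab_start, choreo, choreo_start, depth=5):
    # Bounded trace inclusion: the collaboration may not produce any
    # interaction sequence the choreography does not allow.
    return traces(collab, collab_start, depth) <= traces(choreo, choreo_start, depth)
```

For example, a collaboration that can `cancel` after `order` does not conform to a choreography that only allows `order` followed by `confirm`.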
A Survey on Big Data for Network Traffic Monitoring and Analysis
Network Traffic Monitoring and Analysis (NTMA) represents a key component for network management, especially to guarantee the correct operation of large-scale networks such as the Internet. As the complexity of Internet services and the volume of traffic continue to increase, it becomes difficult to design scalable NTMA applications. Applications such as traffic classification and policing require real-time and scalable approaches. Anomaly detection and security mechanisms require quickly identifying and reacting to unpredictable events while processing millions of heterogeneous events. Finally, the system has to collect, store, and process massive sets of historical data for post-mortem analysis. These are precisely the challenges faced by general big data approaches: Volume, Velocity, Variety, and Veracity. This survey brings together NTMA and big data. We catalog previous work on NTMA that adopts big data approaches to understand to what extent the potential of big data is being explored in NTMA. The survey mainly focuses on approaches and technologies to manage the big NTMA data, and additionally briefly discusses big data analytics (e.g., machine learning) in the service of NTMA. Finally, we provide guidelines for future work, discussing lessons learned and research directions.
UMAP: Urban Mobility Analysis Platform to Harvest Car Sharing Data
Car sharing is nowadays a popular means of transport in smart cities. In particular, the free-floating paradigm lets users look for available cars, book one, and then start and stop the rental at will, within the city area. This is done through a smartphone app, which in turn contacts a web-based backend to exchange information. In this paper we present UMAP, a platform that harvests data freely made available on the web to extract driving habits in cities.
We design UMAP to fetch data from car sharing platforms in real time and process it to extract more advanced information about driving patterns and users' habits, while augmenting the data with mapping and direction information fetched from other web platforms. This information is stored in a data lake, where historical series are built and later analyzed using easy-to-design and easy-to-customize analytics modules.
We prove the flexibility of UMAP by presenting a case study for the city of Turin. We collect car sharing usage data over 50 days and characterize both the temporal and spatial properties of rentals, as well as users' habits in using the service, which we contrast with public transportation alternatives. Results provide insights into driving styles and needs that are useful for smart city planners, and prove the feasibility of our approach.
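A minimal sketch of what an analytics module over the data lake might compute, assuming rentals are stored with timestamps and start/end coordinates; the field names are illustrative, not UMAP's actual schema.

```python
import math
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Rental:
    # Hypothetical record shape for one free-floating rental.
    start: datetime
    end: datetime
    start_lat: float
    start_lon: float
    end_lat: float
    end_lon: float

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle (crow-fly) distance in km between two coordinates.
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def summarize(rentals):
    # Per-rental duration (minutes) and straight-line distance (km),
    # the kind of temporal/spatial properties characterized in the paper.
    return [
        {
            "minutes": (x.end - x.start).total_seconds() / 60,
            "km": haversine_km(x.start_lat, x.start_lon, x.end_lat, x.end_lon),
        }
        for x in rentals
    ]
```

A real module would also join the mapping and direction data fetched from external web platforms; here only the self-contained part is shown.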
Proceedings of the workshops co-organized with the 13th IFIP WG 8.1 working conference on the Practice of Enterprise Modelling (PoEM 2020)
A formal approach to decision support on Mobile Cloud Computing applications
Mobile Cloud Computing (MCC) is an emergent topic that has grown with the
explosion of mobile applications. In MCC systems, application functionalities
are dynamically partitioned between mobile devices and cloud infrastructures.
The main research direction in this field aims at optimizing different
metrics, such as performance, energy efficiency, reliability, and security, in
the dynamic environment in which the MCC application is located. Optimization
in MCC refers to taking advantage of the offloading process, which consists in
moving computation from the local device to a remote one. The biggest
challenge in this respect is to define a strategy able to decide when to
offload and which part of the application to move. This technique generally
improves the efficiency of a system, although it can sometimes lead to
performance degradation.
To decide when and what to offload, in this thesis we propose a new general
framework supporting the design and runtime execution of applications in
their own MCC scenarios. In particular, the framework provides a new
specification language, called MobiCa, equipped with a formal semantics that
captures all the characteristics of an MCC system. Besides the strategy
optimization achieved by exploiting the potential of the UPPAAL model checker,
we propose a set of methods for determining optimal finite/infinite schedules.
These methods manage the resource assignment of components with the aim of
improving system efficiency in terms of battery consumption and time.
Furthermore, we propose two optimized scheduling algorithms, developed in
Java, that exploit parallel computation to improve system performance.
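The offloading trade-off described above can be illustrated with the textbook time-based condition: offload only when remote execution plus data transfer beats local execution. This is a generic sketch under simplifying assumptions (time only, fixed speeds and bandwidth), not the MobiCa/UPPAAL formulation of the thesis.

```python
def should_offload(cycles, local_speed, cloud_speed, data_bytes, bandwidth):
    """Decide whether offloading a task saves wall-clock time.

    cycles:      CPU cycles the task requires
    local_speed: device speed, cycles per second
    cloud_speed: cloud speed, cycles per second
    data_bytes:  input + output data that must cross the wireless link
    bandwidth:   link throughput, bytes per second
    """
    local_time = cycles / local_speed
    remote_time = cycles / cloud_speed + data_bytes / bandwidth
    return remote_time < local_time
```

The condition makes the core tension visible: a fast cloud helps only while the transfer cost does not eat the computation savings, which is why a single static strategy can sometimes degrade performance.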
Machine Learning and Big Data Approaches for Automatic Internet Monitoring
The abstract is in the attachment.
Clustering and evolutionary approach for longitudinal web traffic analysis
In recent years, data-driven approaches have attracted the interest of the research community. Considering network monitoring, unsupervised machine learning solutions such as clustering are particularly appealing to let network analysts observe patterns and track the evolution of traffic over time. In this paper, we present a novel unsupervised methodology to automatically process and analyze batches of HTTP traffic, looking just at the URL structure. First, we describe IDBSCAN (Iterative DBSCAN). We design it to obtain well-shaped clusters and to simplify the choice of parameters, often a cumbersome step for the network analyst. Second, we show LENTA (Longitudinal Exploration for Network Traffic Analysis), which automatically observes the evolution of traffic over time, naturally highlighting trends and pinpointing anomalies.
We first evaluate IDBSCAN and LENTA on synthetic data to compare their performance against well-known algorithms. Then we apply them to a real case, facing the analysis of hundreds of thousands of URLs collected from a live network. Results show both the goodness of the clusters produced by IDBSCAN and LENTA's ability to highlight changes in traffic, facilitating the analyst's job.
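A toy rendering of the iterative idea behind IDBSCAN, assuming a plain DBSCAN over an arbitrary distance function and re-clustering the leftover noise with a progressively relaxed eps; the growth factor and round count here are illustrative, not the paper's parameterization.

```python
def dbscan(points, dist, eps, min_pts):
    # Minimal DBSCAN over an arbitrary distance function.
    # Returns {index: cluster_id}, with -1 marking noise.
    labels = {}
    cluster = 0

    def neighbors(i):
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    for i in range(len(points)):
        if i in labels:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:
            labels[i] = -1          # provisionally noise
            continue
        labels[i] = cluster
        seeds = list(nb)
        while seeds:
            j = seeds.pop()
            if labels.get(j) == -1:
                labels[j] = cluster  # noise reclaimed as border point
            if j in labels:
                continue
            labels[j] = cluster
            nj = neighbors(j)
            if len(nj) >= min_pts:   # core point: keep expanding
                seeds.extend(nj)
        cluster += 1
    return labels

def iterative_dbscan(points, dist, eps, min_pts, rounds=3, grow=1.5):
    # Re-cluster the noise left by each pass with a progressively larger eps,
    # sparing the analyst an up-front hand-tuning of the parameter.
    remaining = list(range(len(points)))
    result = {}
    next_id = 0
    for _ in range(rounds):
        if not remaining:
            break
        labels = dbscan([points[i] for i in remaining], dist, eps, min_pts)
        noise = []
        for k, lab in labels.items():
            if lab == -1:
                noise.append(remaining[k])
            else:
                result[remaining[k]] = next_id + lab
        next_id += max(labels.values(), default=-1) + 1
        remaining = noise
        eps *= grow
    for i in remaining:
        result[i] = -1               # still unclustered after all rounds
    return result
```

In LENTA's setting the points would be URLs and `dist` a string-diversity metric; here 1-D numbers keep the example self-contained.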